Detecting high-quality posts in community question answering sites

Authors

  • Yuan Yao
  • Hanghang Tong
  • Tao Xie
  • Leman Akoglu
  • Feng Xu
  • Jian Lu
Abstract

Community question answering (CQA) has become a new paradigm for seeking and sharing information. In CQA sites, users can ask and answer questions, and provide feedback (e.g., by voting or commenting) on these questions and answers. In this article, we propose the early detection of high-quality CQA questions and answers. Such detection can help discover a high-impact question that would be widely recognized by the users of these CQA sites, as well as identify a useful answer that would gain much positive feedback from site users. In particular, we view post quality from the perspective of the voting outcome. First, our key intuition is that the voting score of an answer is strongly positively correlated with that of its question, and we verify this correlation on two real CQA data sets. Second, armed with the verified correlation, we propose a family of algorithms to jointly detect high-quality questions and answers soon after they are posted on the CQA sites. We conduct extensive experimental evaluations to demonstrate the effectiveness and efficiency of our approaches. Overall, our algorithms can outperform the best competitor in prediction performance, while enjoying linear scalability with respect to the total number of posts. © 2015 Elsevier Inc. All rights reserved.
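The correlation that the abstract relies on (an answer's voting score tracking its question's voting score) can be illustrated with a minimal sketch. This is not the authors' algorithm or data: the `pearson` helper and the `pairs` list are hypothetical, and the numbers are toy values chosen only to show how such a correlation could be measured.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: (question_vote_score, answer_vote_score) pairs.
# These values are made up for illustration and come from no real CQA site.
pairs = [(12, 9), (3, 1), (0, 0), (25, 14), (7, 6), (1, 2)]
question_scores = [q for q, _ in pairs]
answer_scores = [a for _, a in pairs]

r = pearson(question_scores, answer_scores)
print(f"question/answer score correlation: {r:.3f}")
```

A coefficient close to 1 on real data would support the intuition stated in the abstract; the paper's own verification procedure and data sets are not reproduced here.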

Similar resources

Retrieving Rising Stars in Focused Community Question-Answering

In Community Question Answering (CQA) forums, there is typically a small fraction of users who provide high-quality posts and earn a very high reputation status from the community. These top contributors are critical to the community since they drive the development of the site and attract traffic from Internet users. Identifying these individuals could be highly valuable, but this is not an ea...

Finding High-quality Content in Social Media with an Application to Community-based Question Answering

The quality of user-generated content varies drastically from excellent to abuse and spam. As the availability of such content increases, the task of identifying high-quality content in sites based on user contributions—social media sites—becomes increasingly important. Social media in general exhibit a rich variety of information sources: in addition to the content itself, there is a wide arra...

Community-Based Question Answering via Asymmetric Multi-Faceted Ranking Network Learning

Nowadays the community-based question answering (CQA) sites become the popular Internet-based web service, which have accumulated millions of questions and their posted answers over time. Thus, question answering becomes an essential problem in CQA sites, which ranks the high-quality answers to the given question. Currently, most of the existing works study the problem of question answering bas...

Detecting Duplicate Posts in Programming QA Communities via Latent Semantics and Association Rules

Programming community-based question-answering (PCQA) websites such as Stack Overflow enable programmers to find working solutions to their questions. Despite detailed posting guidelines, duplicate questions that have been answered are frequently created. To tackle this problem, Stack Overflow provides a mechanism for reputable users to manually mark duplicate questions. This is a laborious eff...

An Unsupervised Approach for Low-Quality Answer Detection in Community Question-Answering

Community Question Answering (CQA) sites such as Yahoo! Answers provide rich knowledge for people to access. However, the quality of answers posted to CQA sites often varies a lot from precise and useful ones to irrelevant and useless ones. Hence, automatic detection of low-quality answers will help the site managers efficiently organize the accumulated knowledge and provide high-quality conten...

Journal title:
  • Inf. Sci.

Volume 302

Publication date: 2015